Abstract

We report on a search for anomalous kinematics of $t\bar{t}$ dilepton events in $p\bar{p}$
collisions at $\sqrt{s} = 1.96$ TeV using 193 pb$^{-1}$ of data collected with the CDF II
detector. We developed a new a priori technique designed to isolate the subset in a data
sample revealing the largest deviation from standard model (SM) expectations and to quantify
the significance of this departure. In the four-variable space considered, no particular
subset shows a significant discrepancy and we find that the probability of obtaining a data
sample less consistent with the SM than what is observed is 1.0-4.5%.
The discovery of the top quark during Run I of Fermilab's Tevatron collider initiated
an experimental program to characterize its production and decay properties in all possible
decay channels. Within the standard model (SM) the top quark decays almost exclusively to
a W boson and a bottom quark; the "dilepton" decay channel here denotes the case where
the two W bosons from a $t\bar{t}$ pair both decay into final states containing an electron
or a muon, accounting for about 7% of all SM $t\bar{t}$ decays. These events are
characterized by two energetic leptons, two jets from the hadronization of the bottom quarks,
and large missing energy from the unobserved neutrinos. The CDF and D0 Collaborations'
measurements of the $t\bar{t}$ production cross section in the dilepton channel in Run I [1]
showed a slight excess over SM predictions [2]. Perhaps more interestingly, several of the
events observed in the Run I data had missing transverse energy ($\not{E}_T$) and lepton
$p_T$'s [3] large enough to call into question their compatibility with SM top decay
kinematics. In fact, it was suggested that the kinematics of these events could be better
described by the cascade decays of heavy squarks [4], compelling us to subject the top
dilepton sample to careful scrutiny in Run II.

In a previous Letter [5], we reported a measurement of the $t\bar{t}$ production cross
section in the dilepton channel at Run II and found good agreement with the SM expectation.
Here we present the results of a detailed analysis of the kinematics of that data sample.
Motivated by the possible anomalies in the Run I top dilepton sample, we devised a search
for new physics based on the comparison of kinematic features of observed events with those
expected from the SM, assuming a 175 GeV/$c^2$ top mass [6]. The search is designed to be
sensitive to any physical process that gives rise to events with specific kinematics
different from those expected from SM top and backgrounds, especially processes that result
in kinematics similar to the aforementioned Run I events. The method seeks to isolate the
subset of events in a data sample with the largest concentration of possible non-SM physics
and to assign a probability that quantifies its departure from the SM.

Reference [5] provides a description of the CDF II detector, the event selection, and the
data and simulation samples used for this analysis [7]. The basic selection requirements are
(i) two oppositely-charged, well-identified leptons (e or $\mu$) with $p_T$ > 20 GeV/c,
(ii) at least two jets with $E_T$ > 15 GeV, and (iii) $\not{E}_T$ > 25 GeV. Several other
topological requirements are made to further purify the sample and are detailed in [5]. With
this selection, the SM predicts a yield of 8.2 ± 1.1 $t\bar{t}$ events (assuming a
$t\bar{t}$ cross section of 6.7 pb [2]), and 2.7 ± 0.7 events from other SM processes
(mainly production of dibosons, W + associated jets, and Drell-Yan events) in our sample.
Thirteen events are observed.
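
As a rough cross-check of the event count alone, the sketch below evaluates the Poisson
probability of observing at least 13 events when 8.2 + 2.7 are expected. It is an
illustration only: the function name `poisson_tail` is hypothetical, and the quoted
uncertainties on the expectations (± 1.1 and ± 0.7) are ignored, which is an assumption of
this sketch and not part of the analysis described here.

```python
import math

def poisson_tail(n_obs: int, mean: float) -> float:
    """Return P(N >= n_obs) for N ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n_obs))
    return 1.0 - below

# Central values quoted in the text; their uncertainties are ignored here.
expected_total = 8.2 + 2.7   # ttbar + other SM processes
p_excess = poisson_tail(13, expected_total)
print(f"expected = {expected_total:.1f} events, P(N >= 13) = {p_excess:.3f}")
```

The search itself tests the shapes of kinematic distributions for a fixed total of 13
events, so this number is context for the yield and plays no role in the quoted p-value.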

We consider a minimal set of assumptions about the nature of possible non-SM physics in
order to make an a priori choice of which kinematic quantities to investigate. The Tevatron
provides us with the opportunity to look for phenomena beyond the presently known mass
spectrum. This together with the hints from the Run I data sample leads us to focus our
search on events with large lepton $p_T$ and large $\not{E}_T$ resulting from the decay of an
unknown heavy particle. In addition, two-body decays of massive particles (e.g. heavy
chargino decay $\tilde{\chi}^{\pm} \to \ell\tilde{\nu}$) tend to result in topologies where
the charged lepton and the $\not{E}_T$ direction are back-to-back, whereas this tends not to
be the case for the SM $t\bar{t}$ dilepton signature. Thus we expect the following variables
to be sensitive to a wide range of new physics: the event's $\not{E}_T$, the transverse
momentum of the leading (i.e. highest-$p_T$) lepton $p_T^{\ell}$, and the angle
$\phi_{\ell m}$ between the leading lepton and the direction of the $\not{E}_T$ in the plane
transverse to the beam.

We define an additional kinematic variable as follows. The initial and intermediate state
particles in the $t\bar{t}$ decay impose constraints on the final state product properties,
$m(\ell_1\nu_1) = m(\ell_2\nu_2) = m_W$ and $m(\ell_1\nu_1 b_1) = m(\ell_2\nu_2 b_2) = m_t =
175$ GeV/$c^2$. These four constraints leave two of the six unknown neutrino momentum
components unspecified when solving the system of kinematic equations. To fully reconstruct
the event, we scan over these two remaining degrees of freedom and compare the resulting
neutrino momentum sum ($\vec{\not{E}}_T^{\,pred}$) with the $\not{E}_T$ measured in the event
($\vec{\not{E}}_T^{\,obs}$) by computing

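$$\mathcal{T}(\vec{\not{E}}_T^{\,pred}) = \exp\left(-\frac{|\vec{\not{E}}_T^{\,pred} - \vec{\not{E}}_T^{\,obs}|^2}{2\sigma_{\not{E}_T}^2}\right) \qquad (1)$$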

where $\sigma_{\not{E}_T}$ parameterizes uncertainty on $\not{E}_T$ due to mismeasurement of
the underlying event. When performing the scan we assume detector resolutions to be Gaussian
for the lepton and jet momenta and smear the observed values accordingly; the $\not{E}_T$
value is then recomputed according to the smeared jet and lepton energies. We define a
variable $T$ as the square root of the integral of $\mathcal{T}$ over the possible values of
$\vec{\not{E}}_T^{\,pred}$ determined from the scan and summed over a two-fold ambiguity in
the lepton-b-jet pairing. This variable $T$ represents how well an event's kinematics satisfy
the $t\bar{t}$ dilepton decay hypothesis; a non-$t\bar{t}$ dilepton event has on average a
small value of $T$ compared to $t\bar{t}$ events.

As mentioned before, we concentrate our search on events with large values of $\not{E}_T$,
$p_T^{\ell}$, and $\phi_{\ell m}$ and small values of $T$. We therefore assign the following
weight to each event:

$$W = \left(w_{\not{E}_T} \cdot w_{p_T^{\ell}} \cdot w_{\phi_{\ell m}} \cdot w_T\right)^{1/4} \qquad (2)$$

where $w_{\not{E}_T}$, $w_{p_T^{\ell}}$, $w_{\phi_{\ell m}}$, and $w_T$ represent
probabilities (assuming the SM) for an event to have a $\not{E}_T$, $p_T^{\ell}$,
$\phi_{\ell m}$ larger than that observed and a $T$ smaller than that observed, respectively.
We then construct 13 subsets ("K-subsets") of the data; the first subset (K = 1) contains
only the event with the lowest weight W, the second subset (K = 2) contains only the two
events with the two lowest weights, and so on.
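
The weighting and subset construction can be sketched as follows. This is a minimal
illustration, not the analysis code: the tail probabilities are estimated as simple
fractions of a simulated SM sample, the dictionary keys (`met`, `pt`, `phi`, `T`), the
function names, and the toy distributions are all hypothetical, and only two toy "data"
events are used.

```python
import random

VARS_LARGE = ("met", "pt", "phi")  # large values count as "unlikely"
VAR_SMALL = "T"                    # small values count as "unlikely"

def frac_above(sm_values, x):
    """Estimated SM probability of a value larger than x."""
    return sum(v > x for v in sm_values) / len(sm_values)

def frac_below(sm_values, x):
    """Estimated SM probability of a value smaller than x."""
    return sum(v < x for v in sm_values) / len(sm_values)

def event_weight(event, sm_events):
    """Geometric mean of the four single-variable tail probabilities."""
    w = 1.0
    for name in VARS_LARGE:
        w *= frac_above([e[name] for e in sm_events], event[name])
    w *= frac_below([e[VAR_SMALL] for e in sm_events], event[VAR_SMALL])
    return w ** 0.25

def k_subsets(events, sm_events):
    """Nested subsets: the K lowest-weight events for K = 1..len(events)."""
    ranked = sorted(events, key=lambda e: event_weight(e, sm_events))
    return [ranked[:k] for k in range(1, len(ranked) + 1)]

# Toy SM sample with invented shapes (assumption, for illustration only).
rng = random.Random(0)
sm_events = [
    {"met": rng.expovariate(1 / 50.0), "pt": rng.expovariate(1 / 40.0),
     "phi": rng.uniform(0.0, 3.1416), "T": rng.random()}
    for _ in range(5000)
]
typical = {"met": 40.0, "pt": 35.0, "phi": 1.0, "T": 0.6}
unusual = {"met": 200.0, "pt": 150.0, "phi": 3.0, "T": 0.01}
subsets = k_subsets([typical, unusual], sm_events)
print([len(s) for s in subsets], subsets[0][0] is unusual)
```

Because the subsets are nested, the K = 1 subset holds the single least SM-like event and
each later subset adds the next most SM-like one.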

To quantify the departure of the K-subsets from the SM predictions we do a shape
comparison using the Kolmogorov-Smirnov (KS) statistic [8]. For each of the four variables
$i$, the KS deviation $\Delta_{K,i}$ between the SM cumulative function and the cumulative
function of the K-subset is computed. To assess the probability of this deviation we generate
100,000 pseudoexperiments by randomly drawing events from large Monte Carlo samples of
$t\bar{t}$ and SM backgrounds. The number of events corresponding to each SM process is
sampled from a Poisson distribution with mean equal to the number of events expected after
event selection. Only pseudoexperiments with a total of 13 events are accepted. Further, in
each pseudoexperiment, K-subsets are formed and the respective $\Delta_{K,i}$ for each are
calculated. We thus build probability distribution functions for $\Delta_{K,i}$ from which
the KS probability $p_{K,i}$ can be computed. Next we calculate the geometric mean $\Pi_K$ of
the four $p_{K,i}$'s for each pseudoexperiment and form the probability distribution
functions $\mathcal{F}_K$ such that the quantity

$$P_K = \int_0^{\Pi_K} \mathcal{F}_K(\Pi)\, d\Pi \qquad (3)$$

determines how well each K-subset agrees with the SM expectation based on the combined
information from the four variables. We define Q as the value of K with the smallest $P_K$.
By isolating this "unlikely" subset Q (where "unlikely" here denotes having large
$p_T^{\ell}$, $\not{E}_T$, $\phi_{\ell m}$ and/or small $T$), we minimize the dilution of a
possible signal from the inclusion of SM events.
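
The KS step for a single variable can be illustrated with a small toy. The sketch below is
an assumption-laden simplification: a large sorted simulated sample stands in for the SM
cumulative function, the K-subset is taken to be the K largest values of one variable
(a one-variable stand-in for the lowest-weight events), the exponential shape is invented,
and only 2,000 pseudoexperiments are generated. The function names are hypothetical.

```python
import bisect
import random

def ks_distance(subset, sm_sorted):
    """Largest gap between the subset's empirical CDF and the SM CDF."""
    xs = sorted(subset)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f_sm = bisect.bisect_right(sm_sorted, x) / len(sm_sorted)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs((i + 1) / n - f_sm), abs(i / n - f_sm))
    return d

def ks_probability(delta_obs, deltas_pseudo):
    """Fraction of SM pseudoexperiments with a deviation at least as large."""
    return sum(d >= delta_obs for d in deltas_pseudo) / len(deltas_pseudo)

def k_subset(values, k):
    """One-variable stand-in for a K-subset: the k largest values."""
    return sorted(values, reverse=True)[:k]

rng = random.Random(1)
# Toy SM shape (assumption): exponential with mean 40, in arbitrary units.
sm_sorted = sorted(rng.expovariate(1 / 40.0) for _ in range(20000))
N_EVENTS, K = 13, 3

# Build the distribution of the deviation for this K from SM pseudoexperiments.
deltas_pseudo = [
    ks_distance(k_subset(rng.sample(sm_sorted, N_EVENTS), K), sm_sorted)
    for _ in range(2000)
]

toy_data = rng.sample(sm_sorted, N_EVENTS)  # stand-in for the observed events
delta_obs = ks_distance(k_subset(toy_data, K), sm_sorted)
p_k = ks_probability(delta_obs, deltas_pseudo)
print(f"Delta = {delta_obs:.3f}, KS probability = {p_k:.3f}")
```

Calibrating the deviation against pseudoexperiments that apply the same subset selection
matters because a K-subset is biased toward the tails by construction, so a textbook KS
probability would not apply to it.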

We use the quantity $P_Q$ as the test statistic to quantify the discrepancy of the data
with the SM. Generating another set of 100,000 pseudoexperiments from SM Monte Carlo
and repeating the above procedure, we determine $P_Q$ for each pseudoexperiment and build
the probability distribution function $\mathcal{L}(P_Q)$ such that the significance of
departure of the Q-subset of events from the SM is

$$\alpha = \int_0^{P_Q^{data}} \mathcal{L}(P_Q)\, dP_Q. \qquad (4)$$

Here $\alpha$ is the p-value of the test, representing the probability to obtain a data
sample less consistent with the SM than what is actually observed. Sufficiently low values
of $\alpha$ would indicate the presence of new physics in the data sample, and the Q events
would represent the subsample of the data with the largest concentration of new physics.
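
The final step, turning the smallest subset probability into a p-value, amounts to counting
how often SM pseudoexperiments produce an equally small minimum. The sketch below shows
that counting under a loudly stated toy assumption: the 13 subset probabilities in each
pseudoexperiment are drawn as independent uniform numbers, whereas in the real procedure
they are correlated because the subsets are nested. The function name and the value 0.05
are hypothetical.

```python
import random

def global_p_value(pq_data, pq_pseudo):
    """Fraction of SM pseudoexperiments whose smallest P_K is <= the data's."""
    return sum(p <= pq_data for p in pq_pseudo) / len(pq_pseudo)

rng = random.Random(2)
N_SUBSETS = 13

# Toy assumption: independent, uniform P_K values in every pseudoexperiment.
pq_pseudo = [min(rng.random() for _ in range(N_SUBSETS)) for _ in range(20000)]

pq_data = 0.05  # hypothetical smallest P_K found in "data"
alpha = global_p_value(pq_data, pq_pseudo)
print(f"P_Q = {pq_data}, alpha = {alpha:.3f}")
```

The point of the illustration is that the p-value is larger than the smallest subset
probability itself, since choosing the most discrepant of 13 subsets is accounted for by
repeating the same choice in every pseudoexperiment.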

In order to evaluate the performance of the method, we simulated a sample of squark
decays using PYTHIA [9] and the SUSY parameters suggested in [4]. As a performance
benchmark, we construct a 50%:50% mixture of the SM and SUSY and ask how often we
would observe a p-value ($\alpha$) less than 0.3% (the equivalent of a 3$\sigma$ effect)
when 13-event pseudoexperiments are drawn from this sample. We find that ≈ 50% of these
pseudoexperiments yield $\alpha$ < 0.3%. Moreover, the concentration of SUSY events in the
most unlikely K-subset found is on average 80%. By contrast, a KS test without using
subsamples finds $\alpha$ < 0.3% only 21% of the time and does not isolate a mostly-SUSY
subset.
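
For reference, the correspondence between a Gaussian significance and a p-value can be
checked directly. The sketch below assumes a standard normal distribution and shows both
the two-sided and one-sided conventions; the text does not state which convention was
used, so treating 0.3% as the two-sided 3$\sigma$ value is an assumption here.

```python
import math

def two_sided_p(n_sigma: float) -> float:
    """Probability of a standard normal deviate beyond +/- n_sigma."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def one_sided_p(n_sigma: float) -> float:
    """Probability of a standard normal deviate above +n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

print(f"two-sided 3 sigma: {two_sided_p(3.0):.4%}")
print(f"one-sided 3 sigma: {one_sided_p(3.0):.4%}")
```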

We test our procedure as well as our ability to correctly simulate our kinematic variables
in a high-statistics control sample of 973 W + ≥ 3 jets events. We compare these data with a
Monte Carlo simulation of $\not{E}_T$, $p_T^{\ell}$, and $\phi_{\ell m}$ using W + associated
jet, QCD, and $t\bar{t}$ production processes added in the amounts expected from the SM. We
apply a 3-dimensional version of our technique and observe that the data have a high p-value
($\alpha$ = 35.1%), indicating good modeling of the data by the simulation.

We test the modeling of $T$ in a control sample of W + 4 jets events, treating the leading
jet as a second lepton and the subleading jet as a second neutrino. We apply this
reconstruction to the data and to an appropriately weighted sample of simulated $t\bar{t}$
and ALPGEN+HERWIG W + 4 parton Monte Carlo [10]. We observe a KS probability of
0.97 for the respective $T$ distributions, indicating good agreement between simulation and
the data.

Having established that data are adequately modeled by the simulation, we apply the
outlined technique to the $t\bar{t}$ dilepton sample. The distributions of the selected
variables for $t\bar{t}$ dilepton events are presented in Figure 1. We find the most unlikely
subset of events to be the entire data set (i.e. Q = 13), with a p-value of 1.6%. This result
is entirely driven by the excess of leptons at low $p_T$ (< 40 GeV/c) seen in Figure 1; since
the method effectively orders the subsets from high $p_T$ to low $p_T$, the p-value decreases
as more of the low-$p_T$ excess is included, reaching a minimum when the entire data sample
is considered.

A natural question to ask about the low-$p_T$ events is whether they can be attributed
to underestimated non-$t\bar{t}$ SM backgrounds. To address this, we used a displaced
secondary vertex "b-tag" algorithm [11] to look for long-lived b-hadron decays in the events;
the fraction of non-$t\bar{t}$ SM dilepton events containing bottom quarks is expected to be
negligible. We present the b-tag content of the sample as well as the distribution of events
in the ($p_T^{\ell}$, $T$) plane in Figure 2. We note that six of the nine low-$p_T$ events
contain at least one identified b-jet. We also note that more than half of the low-$p_T$
events are consistent with the $t\bar{t}$ kinematic hypothesis with large values of $T$, as
opposed to the small values of $T$ (< 0.05) favored by non-$t\bar{t}$ SM backgrounds (see
Figure 2). We thus conclude that the low-$p_T$ events are not likely to have arisen from
non-$t\bar{t}$ SM processes; details of the thirteen events can be found elsewhere [12].

FIG. 1: $\not{E}_T$, leading lepton $p_T$, $\phi_{\ell m}$, and $T$ distributions for the top
dilepton sample. The hatched regions represent the Poisson uncertainty on the expectation in
a given bin. The dashed histograms are the expected distributions from the SUSY MC described
in the text.

FIG. 2: Top dilepton events in the ($p_T^{\ell}$, $T$) plane with b-tagging information.

We next evaluate the effect of systematic uncertainties. Uncertainties in the shapes of
kinematic distributions from sources listed in Table I lead to an uncertainty in the
probability distribution function $\mathcal{L}(P_Q)$, and consequently to an uncertainty in
the significance level of our measurement. We consider each source of systematic uncertainty
and build a new probability distribution function $\mathcal{L}'(P_Q)$. We then determine a
new p-value $\alpha'$ via

$$\alpha' = \int_0^{P_Q^{data}} \mathcal{L}'(P_Q)\, dP_Q. \qquad (5)$$

Table I shows the values of $\alpha'$ obtained for different sources of uncertainty.
Generating an $\mathcal{L}(P_Q)$ with the inclusion of all systematic effects that give a
p-value greater than that observed in the data (1.6%) results in a maximum p-value of 4.5%;
a minimum p-value of 1.0% is obtained when a background estimate 1$\sigma$ lower than nominal
is used. All other combinations of systematic effects result in p-values lying within this
range.

In conclusion, we have assessed the consistency of the $t\bar{t}$ dilepton sample with the
SM in the four-variable space described and find a p-value of 1.0-4.5%. Our method is
designed to be especially sensitive to data subsets that preferentially populate regions
where new high-$p_T$ physics can be expected. No such subset was found in our data. We have
noted that the lepton $p_T$ distribution exhibits a mild excess at low $p_T$; however, it can
be concluded that new physics scenarios invoked to describe the high-$p_T$/high-$\not{E}_T$
events observed in Run I are not favored by the current Run II data.
TABLE I: p-values obtained upon inclusion of systematic effects. The last row shows the
maximum range of p-values resulting from various combinations of the individual systematics.

Source of uncertainty                      $\alpha'$ (%)
MC generator                               1.6
Initial (final) state radiation            1.2 (1.6)
Parton distribution functions              1.9
$m_t$ = 170 (180) GeV/$c^2$                1.4 (2.1)
Jet energy scale, +1 (-1) $\sigma$         2.1 (2.6)
Background estimates, +1 (-1) $\sigma$     2.7 (1.0)
Combined                                   1.0-4.5